AI/GPU Infrastructure
GPU scheduling, inference systems, agent runtime, resource abstractions, and platform engineering for AI workloads.
Making AI infrastructure understandable, verifiable, and participatory
I focus on how infrastructure software forms real developer ecosystems in the AI era: turning GPU governance, scheduling, inference, and agent runtime into clear methodology, practical paths, and sustainable open collaboration.
I usually read AI infrastructure in layers: applications and agents at the top; runtime, inference, training, and governance in the middle; and GPUs plus accelerated infrastructure underneath. Each layer needs clear resource boundaries, engineering abstractions, and developer participation paths.
My work goes beyond explaining technology: it builds judgment, expression, and collaboration around the critical links in infrastructure ecosystems.
GPU scheduling, inference systems, agent runtime, resource abstractions, and platform engineering for AI workloads.
How Kubernetes, service mesh, and platform engineering extend toward AI-era resource governance, elasticity, and multi-tenancy.
Documentation, tutorials, contribution paths, community communication, and developer activities that make complex infrastructure learnable and participatory.
A value proposition needs long-term validation. I use writing, books, landscape maps, open-source communities, and developer activities to turn abstract judgment into discussable, learnable, and collaborative material.
Cloud-native and Kubernetes phase: focused on container orchestration and platform fundamentals, including Kubernetes, Cloud Native Go, Cloud Native Java, Cloud Native Patterns, and Cloud Native Infrastructure.
Service mesh and microservices phase: deepened governance and traffic architecture through Istio, migration-focused microservice architecture, and Envoy-centered engineering practices.
AI Native Infra and AI phase: built a methodology from AI engineering to infrastructure with the RAG handbook, agentic design patterns, AI handbook, GPU scheduling/virtualization, AI Native Infrastructure, and AI Infra Dao.
Start from real production pain points before introducing concepts and abstractions.
Break complex topics into resource model, runtime, platform engineering, and governance.
Anchor conclusions in concrete projects, measurable signals, and reproducible cases.
Use posts, books, and talks to cross-validate ideas and continuously refine the boundary of practice.
Turn methodology into execution: a curated and continuously updated directory of open-source AI projects and tools.
Browse a structured AI resource list for agents, AI coding tools, model infrastructure, and engineering workflows.
Recent engineering updates and practical notes that continue the research threads above.
Why GPU Is the Foundation of AI
A GPU explainer for Kubernetes veterans new to AI. Maps token, model, training, inference, Transformer, Tensor Core, HBM, and KV cache to concepts you already know.
From GPU utilization to productive GPU-hours.
Jimmy focuses on AI-Native Infrastructure and computing governance, with long-term research on GPU virtualization, heterogeneous scheduling, and system-level architecture for AI workloads. He is Open Source Ecosystem VP at Dynamia.ai, CNCF Ambassador, and founder of the Cloud Native Community (China), and continues driving the shift from cloud-native to AI-native engineering.
